Intro
Okay, this tutorial will teach the basics of how to crack an encrypted program. We will use a demonstration crackme that I coded. This crackme is a simple nag elimination. The nag is displayed through the API function MessageBoxA, so normal removal would be easy. However, this routine is encrypted, and therefore patching it is a slight problem. I am here to remove this problem and open the world of cracking SMC to the less experienced crackers.
What is encryption?
Encryption is where data is modified in some way to make it unreadable. Within programming this involves having a certain routine, usually at the start of the program, that decrypts the rest of the program before it is executed.
Why use encryption?
Applications are usually encrypted to stop people reversing them. If you attempt to disassemble an application that is encrypted, you are likely to get a mess of a listing that is near impossible to follow and of very little use.
The problem with encryption
For the application to run, the code must be decrypted before it is executed. Therefore, it is relatively easy for someone to view the program in memory while it is running and investigate what is going on. However, a clever program can re-encrypt its code after it has been executed to make understanding the code a little more difficult.
So what is SMC?
SMC stands for 'Self Modifying Code'. This is where an application modifies itself at run-time. An encrypted program is very likely to use SMC, as it somehow needs to decrypt its code, which involves writing the decrypted program code back over the encrypted code in memory, and in essence modifying itself.
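To make the idea concrete, here is a minimal sketch in Python of a buffer that holds its payload in encrypted form and restores it in place before use. The key 0x5A, the 'HDR:' header and the payload text are all made up for this illustration and have nothing to do with the crackme.

```python
# Hypothetical illustration: a "program image" whose payload is stored
# XOR-encrypted and is restored in place by a small decryptor before use.
KEY = 0x5A  # assumed single-byte key, chosen only for this example


def xor_in_place(buf, start, end, key):
    """XOR every byte of buf[start:end] with key, overwriting the buffer."""
    for i in range(start, end):
        buf[i] ^= key


plain = b"show nag screen"
image = bytearray(b"HDR:" + plain)        # 4-byte header followed by the payload

xor_in_place(image, 4, len(image), KEY)   # what is stored on disk: encrypted payload
encrypted_view = bytes(image)

xor_in_place(image, 4, len(image), KEY)   # run-time decryptor: the same operation again
decrypted_view = bytes(image)

print(encrypted_view[4:] != plain, decrypted_view[4:] == plain)
```

The buffer rewrites its own contents before they are used, which is all that 'self modifying' really means here.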
Does SMC have any more uses?
SMC has many uses within the programming world. However, it is rarely used. SMC is used within encryption, packers, and viruses, to mention but a few common applications. SMC is ideal whenever a program needs to react dynamically to environmental conditions.
Our target
Now for our application. When we run the program we are confronted with a horrible looking nag screen. Now, as we are all used to programming under Win32, we recognise this dialog as the one displayed by the function MessageBox. Now what you may not know is that there are two versions of the Win32 MessageBox function. One takes ANSI (8-bit) strings, the other takes Unicode (wide-character) strings. How do we know which is which? Well, this is relatively simple. The guys at Microsoft decided that the ANSI version of each call that takes strings should have the letter A appended onto the call name, and the Unicode version the letter W. So, as our application is a simple one that uses ANSI strings, we can guess that the dialog is displayed using the function MessageBoxA (the breakpoint we set later will confirm it).
Now you're probably worrying about not typing MessageBoxA when you use these message boxes in your application. Well, that's not a problem either, as the Windows headers used by most high-level compilers automatically map MessageBox to the A or W version for you. If, however, you write your programs in a real language such as Assembly, well, then you have to name the version you want yourself.
Now that we know what function the program uses, we can attempt to remove the part of the program that calls it. To do this successfully we will need to know what parameters the function expects, and what return value it gives. So we open up our Windows API reference and find the details for the call MessageBox (yep, the reference documents it under the plain name, without the A). You should see the following format for the function:
int MessageBox(
    HWND hWnd,          // handle of owner window
    LPCTSTR lpText,     // address of text in message box
    LPCTSTR lpCaption,  // address of title of message box
    UINT uType          // style of message box
);
Now a little Win32asm knowledge is essential. We need to know how this function would be called in an assembly program. Here is a sample code snippet of a MessageBox call:
push MB_OK ;style of message box
push offset MsgTitle ;address of the title of message box
push offset MsgText ;address of text in message box
push hwnd ;handle of owner window
call MessageBoxA ;execute the function
You should notice that the parameters for the function are pushed onto the stack in reverse order and then the call is executed. What is the stack? The stack is a form of temporary storage available to programmers. The stack works on a LIFO (last in, first out) basis. Therefore, when you attempt to get a value back off the stack you will receive the last value you pushed onto it. For example:
push Value1 ;put our first value on top of the stack
push Value2 ;put our second value on top of the previous value in the stack
push Value3 ;put the last value on top of all the others
pop Value3 ;restore the third value
pop Value2 ;restore the second value
pop Value1 ;and restore the first value
Normally, when a Windows function is executed it will 'pop' all the parameters off the stack, leaving your stack as it was before you pushed the function's parameters and called the function. This does, however, depend on the calling convention used, but don't worry: Windows API functions nearly always operate this way (the stdcall convention). This knowledge of the stack that I have just given you should be enough for most of your reverse engineering needs.
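If the stack still feels abstract, here is a tiny model of it in Python. It is only a sketch: a plain list stands in for the stack, and the parameter names are just the ones from the MessageBox prototype above.

```python
# A plain Python list used as a LIFO stack (illustration only).
stack = []


def push(value):
    stack.append(value)      # new values always go on top


def pop():
    return stack.pop()       # the most recently pushed value comes off first


# Parameters are pushed right to left, so the first parameter ends up on top.
params = ["hWnd", "lpText", "lpCaption", "uType"]
for arg in reversed(params):
    push(arg)

# The function then takes them off again in declaration order.
popped = [pop() for _ in range(len(params))]
print(popped)
```

Once the function has removed its four parameters the stack is empty again, which is the 'leaving your stack as it was' behaviour described above.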
So now we know all the theory behind the program's displaying of the message box, we need to make our debugger stop at the point in the program where this function will be executed. I use SoftICE as my debugger, and so should everybody. SoftICE is available from many 'warez' sites throughout the internet and I am informed that the latest version is 4.01, but I have only ever seen and used version 4. I do not, however, condone piracy, so you should buy this wonderful application from
Assuming that you have successfully installed SoftICE, and included all the relevant exports (get another tut on this, there are MANY), we need to tell the debugger to break on execution of the function MessageBoxA. Firstly, enter SoftICE by pressing 'CTRL-D' from within Windows. You should now be presented with the glorious world of SoftICE. Now type 'BPX MessageBoxA' within SoftICE. You should receive no error messages. If you do, then you've not set up SoftICE correctly. Now we press 'CTRL-D' again to exit SoftICE, and then when back in Windows we go and run our application. Almost as soon as you run the application you should be popped back into SoftICE.
Now we are stuck deep within the depths of the Windows code. We want to return to the process that originally called the function (our target application). We press 'F12' to return to the section of code that called it, and we should be shoved back into Windows and see that horrible looking nag screen. If you click on the 'OK' button you will be back in SoftICE. Below the code window you should see the name of our application looking something like: n0p3x!CODE+### where ### is any sequence of hexadecimal digits. Now we are where we want to be. Take a little look around this section of code, go on, it won't bite. Get a REAL feel for the code. By holding the 'CTRL' key and pressing the up and down arrows you can scroll up and down the code to your heart's content.
You should see the four values that are pushed onto the stack before the call. If you want to investigate some of the parameters to ensure that this is the correct message box you can type 'D [MEMORY ADDRESS]' where memory address is the number trailing the push statement. If you do this for the second and third push you should see both the message box's title and the message box's text. Okay, enough messing around. We need to remove this whole call. This means removing all of the pushes as well, otherwise we will mess up the program's stack and probably crash.
Removing the call is easy. There is a special instruction called 'NOP' which, when executed, does nothing except cause a small delay (don't worry, when I say small, I mean SMALL). We simply need to replace the whole call and pushes with a few NOPs. However, a NOP is only one byte long, and no doubt our message box call takes more than one byte. We simply replace every byte with a NOP, as we don't want to change the size of the code: that would shift everything after it and DEFINITELY result in a crash. How do we know how many bytes our call takes? Well, this is quite simple. If you type 'CODE ON' from within SoftICE you will see the opcodes for every instruction. It should look a little like this:
6A00        push 00        ;the message box style
6800204000  push 402000    ;the caption
6885204000  push 402085    ;the text
6A00        push 00        ;the parent window's handle
E84E000000  call USER32!MessageBoxA
Don't worry that the parent window's handle is 0. This just means that the message box doesn't have a parent window. One byte is two hex digits. So we can tell that the first push is two bytes and would need replacing with two NOPs. We need to find this section of code in our compiled EXE within a hex editor. I use HIEW and Hex Workshop, although Hex Workshop is probably better for beginners. Within the hex editor search for the hex byte sequence '6A006800204000'. This is simply the opcodes for the first two pushes put together. If you find this byte sequence then you should search again to see if the sequence exists elsewhere within the program; if it does, then you need to lengthen the byte sequence you search for by adding on more opcodes from the rest of the code until you get a single match. Once you find the code you would replace every byte above (19 in total) in the EXE with the opcode 90. 90 is the opcode for NOP, but I'm pretty sure that most people know this.
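For those who prefer a script to a hex editor, the search-and-replace just described can be sketched in Python. This is only an illustration: 'fake_exe' is a made-up stand-in for an unencrypted file, and the helper names find_all and nop_out are my own, not part of any tool.

```python
NOP = 0x90

# Opcodes as shown with CODE ON, one hex string per instruction.
instructions = ["6A00", "6800204000", "6885204000", "6A00", "E84E000000"]
sizes = [len(bytes.fromhex(op)) for op in instructions]   # two hex digits per byte
total = sum(sizes)                                        # number of NOPs needed


def find_all(data, pattern):
    """Return every offset at which pattern occurs in data."""
    hits = []
    pos = data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1)
    return hits


def nop_out(data, offset, length):
    """Overwrite length bytes at offset with NOPs; the size never changes."""
    patched = bytearray(data)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical unencrypted file: some padding, the call sequence, a ret.
code = bytes.fromhex("".join(instructions))
fake_exe = b"\xCC" * 8 + code + b"\xC3"

hits = find_all(fake_exe, bytes.fromhex("6A006800204000"))
if len(hits) == 1:                       # only patch when the match is unique
    patched = nop_out(fake_exe, hits[0], total)
print(sizes, total, hits)
```

If find_all returns more than one offset, lengthen the pattern with more opcodes, exactly as you would in the hex editor.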
However, we have a problem. The byte sequence doesn't exist. What has happened here? Well, the program is encrypted in some way to try and stop you altering it. If the program wasn't encrypted then you would probably have found the bytes okay and the patch would work as expected.
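Here is a small Python sketch of why the search comes up empty. It jumps ahead slightly by assuming the encryption is the byte-wise xor with 1 that we uncover further on; the variable names are only for illustration.

```python
pattern = bytes.fromhex("6A006800204000")          # first two pushes, plain

plain_code = bytes.fromhex("6A00" "6800204000" "6885204000" "6A00" "E84E000000")
stored_code = bytes(b ^ 0x01 for b in plain_code)  # assumed on-disk (encrypted) form

found_in_plain = plain_code.find(pattern)          # offset of the match, or -1
found_in_stored = stored_code.find(pattern)
print(found_in_plain, found_in_stored)
```

The pattern sits at offset 0 of the plain bytes but is nowhere in the stored ones, since every byte on disk differs from the byte that will actually be executed.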
It is a safe bet that the code is decrypted before it is executed. We simply need to find out where this happens to see what is going on. Open the program in a disassembler. I use WinDasm 8.9. You need to find the decryption routine. The routine should look something like this:
mov reg1, addr-to-write-to
mov reg2, [reg1]
;manipulate reg2
mov [reg1], reg2
We know that this must be executed before the message box is displayed, otherwise the program would attempt to execute the encrypted bytes, which would most likely crash. We know that the first push is at address 4011B0 in memory. Simply go to that code location in your disassembler and you should see a mass of stupid instructions. This is the encrypted version of our message box call. If we simply backtrace from this point we will see where the program rewrites this data. If you scroll up a little you should see the following text:
*Referenced by a CALL at address:
:00401020
So we know where this little message box procedure is called from. Go to that address in your disassembler. Investigate this new area of code you have landed in. Notice it is very near to the start of the program and that only four functions are called before our message box. This doesn't leave much room for our decryption to hide. As three of these functions are Windows functions, we know the decryption routine should reside in the unnamed call, which, coincidentally, is called on the line before our message box. Go to the address that the call is pointing at in your disassembler. You should see a very suspicious looking routine indeed. It should look like this:
* Referenced by a CALL at Address:
|:0040101B
|
:00401194 B8AB114000    mov eax, 004011AB
* Referenced by a (U)nconditional or (C)onditional Jump at Address:
|:004011A8(U)
|
:00401199 8A18          mov bl, byte ptr [eax]
:0040119B 80F301        xor bl, 01
:0040119E 8818          mov byte ptr [eax], bl
:004011A0 40            inc eax
:004011A1 3DC3114000    cmp eax, 004011C3
:004011A6 7F02          jg 004011AA
:004011A8 EBEF          jmp 00401199
* Referenced by a (U)nconditional or (C)onditional Jump at Address:
|:004011A6(C)
|
:004011AA C3            ret
From your basic knowledge of asm you should be able to translate this whole section of code. But I'll just skim through it to make sure :-). Okay, firstly the mov eax, 004011AB moves the address of the start of the encrypted code into a register. Then the mov bl, byte ptr [eax] moves the first byte of the encrypted code into the register bl. The xor bl, 01 "encrypts" the byte ;-) (xor undoes itself, so applying it to an encrypted byte decrypts it). The mov byte ptr [eax], bl writes the decrypted byte back over the encrypted one. The inc eax increments the value in the register eax to point to the next byte of the encrypted routine. The program then (cmp eax, 004011C3) checks if we are past the end of the encrypted routine, and if we are at a greater address then it (jg 004011AA) jumps to a single ret which exits the function. If we are not past the end of the encrypted data then we (jmp 00401199) jump back to the start of the loop and decrypt the next byte.
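The same loop can be followed step by step in a few lines of Python. Treat this as a sketch under stated assumptions: memory is modelled as a dictionary from address to byte, the addresses come from the listing, and the bytes outside the message box call are unknown to us, so 0x00 is used as a placeholder for them.

```python
# Addresses taken from the listing. Byte values outside the message box call
# are not known, so 0x00 is a placeholder (assumption for this sketch).
START, LAST = 0x004011AB, 0x004011C3
FIRST_PUSH = 0x004011B0


def run_decryptor(memory):
    """Mirror the loop at 00401199 and return the number of bytes rewritten."""
    eax = START                      # mov eax, 004011AB
    count = 0
    while True:
        bl = memory[eax]             # mov bl, byte ptr [eax]
        bl ^= 0x01                   # xor bl, 01
        memory[eax] = bl             # mov byte ptr [eax], bl
        count += 1
        eax += 1                     # inc eax
        if eax > LAST:               # cmp eax, 004011C3 / jg 004011AA
            return count             # ret
        # otherwise: jmp 00401199


plain_call = bytes.fromhex("6A00" "6800204000" "6885204000" "6A00" "E84E000000")
memory = {addr: 0x00 for addr in range(START, LAST + 1)}
for i, b in enumerate(plain_call):
    memory[FIRST_PUSH + i] = b ^ 0x01    # encrypted form of the call

rewritten = run_decryptor(memory)
recovered = bytes(memory[FIRST_PUSH + i] for i in range(len(plain_call)))
print(rewritten, recovered == plain_call)
```

Because the comparison happens after the increment, the loop rewrites 25 bytes, 004011AB through 004011C3 inclusive, and the 19 bytes from 4011B0 onwards come out as the pushes and call we saw in SoftICE.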
So, from this you WILL be able to tell that all the program does is xor each byte with the number 1. This leaves us many ways to attack the application. We could simply decrypt the bytes by hand and NOP out the call to the decryption routine, write encrypted NOPs in place of the message box call, or simply NOP both calls, as neither of them does anything except produce the nag screen.
We will opt for the harder option. We will inject encrypted code back into the file. As we know where the encrypted code resides in WinDasm, we can find the file offset easily. If you click on the line of code where the first push would be if it wasn't encrypted, the status bar should display an offset. In a hex editor just tell it to go to that offset. Now, WinDasm shows us that the encrypted code looks like:
All we need to do is to replace each byte with our encrypted NOP. If you do a simple 90h xor 1h you should get the encrypted value for the NOP. It's 91h. Now write
the bytes back to the file in a hex editor. Run the program and voila. It runs minus a nag screen.
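The arithmetic is easy to check with a short Python sketch. The plain opcodes are the ones SoftICE showed us; 'file_image' and the offset 16 are invented stand-ins for the real file and the offset read from the status bar.

```python
KEY = 0x01
NOP = 0x90
encrypted_nop = NOP ^ KEY            # the value to write into the file

plain_call = bytes.fromhex("6A00" "6800204000" "6885204000" "6A00" "E84E000000")
encrypted_call = bytes(b ^ KEY for b in plain_call)   # the call as stored on disk


def patch(data, offset, length, value):
    """Overwrite length bytes at offset with value, keeping the size the same."""
    out = bytearray(data)
    out[offset:offset + length] = bytes([value]) * length
    return bytes(out)


# Hypothetical file image: filler, the encrypted call, one trailing byte.
OFFSET = 16                          # assumed offset, for illustration only
file_image = b"\x00" * OFFSET + encrypted_call + b"\xC2"
patched = patch(file_image, OFFSET, len(plain_call), encrypted_nop)

# What the program's own decryption loop turns the patched bytes into.
at_runtime = bytes(b ^ KEY for b in patched[OFFSET:OFFSET + len(plain_call)])
print(hex(encrypted_nop), encrypted_call.hex(), at_runtime.hex())
```

So the program's decryptor does our work for us: each 91h we write is turned into a 90h just before it is executed.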
Extra notes
Unfortunately, most encryption routines are a lot more complex than a simple xor. However, once you know the basics it's all the same theory, just in a more complex manner. Sometimes there may be more than one decryption routine. Sometimes the whole program (including resources) will be encrypted. However, all have a decryption routine. It is, though, possible for the decryption routine to be encrypted itself, and to have its own decryption routine, but that will involve the same theory and just take more time.
Some people may complain about me just using NOPs to patch with. I don't see why. They complain that some programs have CRC checks, but this is very rare in encrypted programs; besides, any CRC that can be coded can be cracked :-). Another tut, I think :-)